Abstract. We describe and develop three recent novelties in network research which are
particularly useful for studying social systems. The first one concerns the discovery of some
basic dynamical laws that enable the emergence of the fundamental features observed in
social networks, namely the nontrivial clustering properties, the existence of positive degree
correlations and the subdivision into communities. To reproduce all these features we describe
a simple model of mobile colliding agents, whose collisions define the connections between
the agents which are the nodes in the underlying network, and develop some analytical
considerations. The second point addresses the particular feature of clustering and its
relationship with global network measures, namely with the distribution of the size of cycles
in the network. Since in social bipartite networks it is not possible to measure the clustering
from standard procedures, we propose an alternative clustering coefficient that can be used
to extract an improved normalized cycle distribution in any network. Finally, the third point
addresses dynamical processes occurring on networks, namely when studying the propagation
of information in them. In particular, we focus on the particular features of gossip propagation
which impose some restrictions in the propagation rules. To this end we introduce a quantity,
the spread factor, which measures the average maximal fraction of nearest neighbors which
get in contact with the gossip, and find the striking result that there is an optimal non-trivial
number of friends for which the spread factor is minimized, decreasing the danger of being
gossiped.

PACS numbers: 89.65.-s, 89.75.Fb, 89.75.Hc, 89.75.Da
1. Introduction

Contrary to what may be perceived at first glance, social and physical models have been brought
together several times during the last four centuries. In fact, not only were Maxwell and
Boltzmann inspired by the statistical approaches of the social sciences when developing the
kinetic theory of gases, but one can even cite the English philosopher Thomas Hobbes, who
already in the seventeenth century used a mechanical approach to try to explain how people's
acquaintances and behaviors may contribute to the evolution towards a stable absolute
monarchy [1] [2]. Rather than taking a historical perspective on whether these approaches were
successful and correct, we note that it is almost unquestionable that, at a certain level, there
are social phenomena that can be understood more deeply by using the approaches of statistical
and physical models. Recently [3] [4] [5], this perspective gained considerable strength from
the increased interest in the network approach, which has in several senses been successful.
There, one describes complex systems by mapping them onto a graph (network) of nodes and
links and studies their structure and dynamics with the help of statistical and topological
tools from statistical physics and graph theory [6] [7].

When addressing the specific case of a social system, nodes represent individuals and
the connections between them represent social relations and acquaintances of a certain kind.
Social networks have been studied in different contexts [8] [9] [10] [11] [12] [13] [14], ranging
from epidemic spreading and sexual contacts to language evolution and voting in elections.
However, although they are ubiquitous, social networks differ from most other networks, which
leaves a broad spectrum of unanswered questions and possible improvements in the study of
their statistical and topological properties. In this paper, we address three fundamental open
questions related to the typical structure and dynamics of social networks.

The first open question has to do with the modeling of social networks. The recent
broad study of empirical social networks has shown that they have three fundamental
features in common [15]. First, they present the small-world effect [16], with
small average path lengths between nodes and high clustering coefficients, meaning that
the neighbors of a node tend to be connected with each other. Second, they have positive
degree correlations: highly (poorly) connected nodes tend to be connected to other highly
(poorly) connected nodes. Third and last, one invariably observes an organization of the
network into subsets of nodes (communities) that are more densely connected among
themselves. Although there are arguments pointing out that all these features could be a
consequence of one another [15], the modeling of specific social networks reproducing all
these features quantitatively has not been successful. Using a recent approach to construct
networks, based on a system of mobile agents, it is possible to reproduce all these features.
In Section 2 we will further show that the degree distributions characterizing social networks
typically follow a specific one-parameter distribution, the so-called Brody distribution.

The second question is related to the intrinsic nature of the nodes. For certain social
networks there are intrinsic features of the individuals which must be considered in the
analysis, for instance the gender in networks of sexual contacts [14] or the hierarchical
position in a network of social contacts inside some enterprise. From the network point of
view this distinction introduces multipartivity in the network, biasing the preferential
attachment between nodes, which tend to connect with nodes of a certain type. When there
are two types of nodes, e.g. men and women, and the connections between them are strongly
related to this type, e.g. men can only match women and vice versa, the standard measures
used to analyze network structure fail. In particular, the standard clustering coefficient [16]
is unable to quantify the connectedness of broader neighborhoods that typically appear in
multipartite networks. In Sec. 3 we will revisit some of the clustering coefficients used to
study clustering in bipartite networks, and show how the combination of the standard and the
bipartite clustering coefficients can yield good estimates of normalized cycle distributions.
Moreover, we will discuss a general theoretical picture of a global measure built from
clustering coefficients of increasing order, according to some suitable expansion.

The third open question has to do with the heterogeneity of nodes in what concerns their
influence on the connections and therefore on propagation phenomena in social networks.
In rumor propagation [17], for instance, one usually treats all connections equally in the
spread of some signal (opinion, rumor, etc.). This is a suitable assumption for situations like
the spread of an opinion which is equally interesting to all nodes in the network, for example
political opinions before an election. However, there are also several social situations
where the signal is not equally interesting to all nodes, as in the case of the spreading of
gossip about some common friend. In these cases some connections are more likely to be used
to spread the signal than others, since not all our friends are also friends of the particular
person being gossiped about; therefore, either we tend not to tell the gossip to them or they
tend not to spread it even if they hear it. In Sec. 4 we will present a simple model for gossip
propagation and describe some striking features, namely that there is an optimal number of
friends, depending on the degree distribution and degree correlations of the entire network,
for which the danger of being gossiped about is minimized.

Finally, in Sec. 5 we draw conclusions and give an overview of future questions, arising
from the topics studied throughout the paper, which could be addressed in social networks.

2. Modeling social networks: an approach based on mobile agents 

Since the study of social networks is mainly concerned with topological and statistical features
of people's acquaintances [8] [9], the modeling of such networks has been done within the
framework of graph theory using suitable probabilistic laws for the distribution of connections
between individuals [3] [4] [5] [7]. This approach proved to be successful in several contexts,
for instance to describe the formation of communities [18] [19] and their growth [20].

However, such models present two major drawbacks. First, the graph approach may be suited to
describe the structure of social contacts and acquaintances, but it fails to give insight into
the social dynamical laws underlying it. Second, these models seem unable to reproduce all
the main features characteristic of social networks, at least at the fundamental level. In this
context, it was pointed out [21] [22] [23] that dynamical processes based on local information
should also be considered when modeling the network. Our recent proposal to overcome these
shortcomings was to construct networks from a system of mobile agents following a simple
motion law [13] [14]. Here, we briefly review this model and further present the analytical
expression that fits the obtained degree distributions. In particular we show that the degree
distribution typically follows a Brody distribution [24].

Figure 1. Illustration of the two-dimensional mobile agents system. Initially there are no
connections between nodes, and nodes move with some initial velocity $v_0$ in a randomly
chosen direction (arrows). At t = 1 two nodes, $P_1$ and $P_2$, collide and a connection
between them is introduced (solid line); their velocities are updated by increasing their
magnitude and choosing a new random direction. At t = 2 two other collisions occur, between
nodes $P_2$ and $P_4$ and between nodes $P_1$ and $P_3$. In this way a network of nodes and
connections between them emerges as a straightforward consequence of their motion (see text).

2.1. The model 

The model is given by a system of particles (agents) that move and collide with each other,
forming through those collisions the acquaintances between individuals. Consequently, the
network results directly from the time evolution of the system and is parameterized by only
two parameters: the density $\rho$ of agents, characterizing the system composition, and the
maximal residence time $T_\ell$, controlling its evolution. Each agent $i$ is characterized by
its number $k_i$ of links and by its age $A_i$. When initialized, each agent has a randomly
chosen age, position and moving direction with velocity $v_0$, and one sets $k_i = 0$. While
moving, the individuals follow ballistic trajectories until they collide. As a first
approximation we assume that social contacts do not determine which social contact will occur
next. Therefore, after a collision the total momentum need not be conserved, and the two
agents choose completely random new moving directions. Figure 1 sketches consecutive stages
of the evolution of such a system of mobile agents.

Assuming that a large number of acquaintances tends to favor the occurrence of new
contacts, the velocity should increase with the degree $k$, namely

$$\vec v(k_i) = (\bar v\, k_i^{\alpha} + v_0)\,\vec\omega, \qquad (1)$$



Social networks: models and measures 



Figure 2. Bridging between real social networks with average degree $\langle k\rangle$ and the
system of mobile agents that reproduces their topological and statistical features. In (a) the
normalized maximal residence time of the agents is plotted as a function of the average degree,
while in (b) one plots the collision rate $\lambda$, which is a unique function of the residence
time and scales with $\langle k\rangle$.



where $\bar v = 1$ m/s is a constant that assures dimensions of velocity,
$\vec\omega = (\vec e_x \cos\theta + \vec e_y \sin\theta)$ with $\theta$ a random angle, and
$\vec e_x$ and $\vec e_y$ are unit vectors. The exponent $\alpha$ in Eq. (1) controls
the velocity update after each collision. Here, we consider $\alpha = 1$. Further, the removal
of agents considered here is simply imposed by a threshold $T_\ell$ on the age of the agents:
when $A_i = T_\ell$, agent $i$ leaves the system and a new agent $j$ replaces it with $k_j = 0$,
$v_j = v_0$ and a randomly chosen moving direction. The selected values of $T_\ell$ must be of
the order of several times the characteristic time $\tau$ between collisions, in order to avoid
the premature death of nodes. Too large values of $T_\ell$ are also inappropriate, since in that
case each node may on average collide with all other nodes, yielding a fully connected network.
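
The rules described so far (ballistic motion, link creation at collisions, degree-dependent speed and removal at a maximal age) can be sketched in a few lines of Python. This is only a minimal, time-stepped illustration and not the authors' implementation: the function name `simulate_mobile_agents`, the periodic square box, the agent radius, the time step and the choice to delete all links of a removed agent are assumptions made here for the sake of a runnable example.

```python
import math
import random


def simulate_mobile_agents(n_agents=50, box=15.0, radius=0.5, v0=1.0, v_bar=1.0,
                           alpha=1.0, t_res=20.0, dt=0.05, t_total=40.0, seed=0):
    """Time-stepped sketch of the mobile agent model; returns a list of neighbor sets."""
    rng = random.Random(seed)
    pos = [[rng.uniform(0, box), rng.uniform(0, box)] for _ in range(n_agents)]
    ang = [rng.uniform(0, 2 * math.pi) for _ in range(n_agents)]
    age = [rng.uniform(0, t_res) for _ in range(n_agents)]  # random initial ages
    links = [set() for _ in range(n_agents)]                # k_i = len(links[i])
    touching = set()  # pairs overlapping in the previous step (count each contact once)

    for _ in range(int(t_total / dt)):
        # Ballistic motion with speed v_bar * k_i**alpha + v0, as in Eq. (1).
        for i in range(n_agents):
            speed = v_bar * len(links[i]) ** alpha + v0
            pos[i][0] = (pos[i][0] + speed * math.cos(ang[i]) * dt) % box
            pos[i][1] = (pos[i][1] + speed * math.sin(ang[i]) * dt) % box
            age[i] += dt

        # Collisions: a new overlap creates a link and randomizes both directions.
        now = set()
        for i in range(n_agents):
            for j in range(i + 1, n_agents):
                dx = abs(pos[i][0] - pos[j][0])
                dy = abs(pos[i][1] - pos[j][1])
                dx, dy = min(dx, box - dx), min(dy, box - dy)  # periodic box (assumption)
                if dx * dx + dy * dy <= (2 * radius) ** 2:
                    now.add((i, j))
                    if (i, j) not in touching:
                        links[i].add(j)
                        links[j].add(i)
                        ang[i] = rng.uniform(0, 2 * math.pi)
                        ang[j] = rng.uniform(0, 2 * math.pi)
        touching = now

        # Removal: an agent reaching the maximal age is replaced by a fresh one
        # with no links (dropping the old agent's links is an assumption).
        for i in range(n_agents):
            if age[i] >= t_res:
                for j in links[i]:
                    links[j].discard(i)
                links[i] = set()
                age[i] = 0.0
                pos[i] = [rng.uniform(0, box), rng.uniform(0, box)]
                ang[i] = rng.uniform(0, 2 * math.pi)
                touching = {p for p in touching if i not in p}
    return links
```

Calling `simulate_mobile_agents()` returns one neighbor set per agent, from which the average degree and the degree distribution can be measured. The pairwise overlap check costs a number of operations quadratic in the number of agents, so the sketch is only practical for small systems.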

Similarly to other systems [25] [26], this finite $T_\ell$ enables the entire system to reach a
non-trivial quasi-stationary state [13]. In fact, only by tuning $T_\ell$ within an acceptable
range of small density values does one reproduce networks of social contacts. In Fig. 2a one
sees the normalized residence time $T_\ell/\tau$ as a strictly monotonic function of the average
degree $\langle k\rangle$. From the residence time it is also possible to define a collision rate,
as the ratio between the average residence time $\langle T_\ell - A(0)\rangle = T_\ell/2$ and the
characteristic time $\tau$, namely
$\lambda = T_\ell/(2\tau) = \langle v\rangle T_\ell/(2 v_0 \tau_0)$, where $\tau_0$ is the
characteristic time of the system at the beginning, when all agents have velocity $v_0$.
Figure 2b shows clearly that $\lambda = 2\langle k\rangle$.

By looking at Fig. 2 one now understands the main strength of the mobile agent model
described here: when taking a real network of social contacts and measuring the average
degree $\langle k\rangle$, the correspondence sketched in Fig. 2 straightforwardly returns the
suitable value of $T_\ell$ that reproduces its topological and statistical features.

It was already reported [13] [27] that empirical networks extracted from a survey among
84 American schools are easily reproduced with this mobile agent model, in what concerns the
degree distribution, second-order correlations, community structure, average path length and
clustering coefficient. As an illustration, Fig. 3 shows the degree distribution $P(k)$ of nine
such schools (symbols). Such distributions are well fitted by Brody distributions (solid lines),
defined as [24]:

$$P_B(k) = B\,\bar k^{\beta}\exp\left(-\eta\,\bar k^{\beta+1}\right), \qquad (2)$$

with $\bar k = k/\langle k\rangle$, $\eta$ a constant specified by Eq. (3), which is missing from
the source text, and $B$ a normalization constant. Roughly, the Brody distribution in Eq. (2) is,
apart from some special constants, the product of a power of $k$ with an exponential whose
negative exponent is proportional to a higher power of $k$. For the particular case $\beta = 0$,
the Brody distribution reduces to the exponential distribution, which always has a non-positive
derivative.

Figure 3. Degree distributions of nine different schools (symbols) from an in-school
questionnaire answered by a total of 90118 students in a survey carried out between
1994 and 1995. Each school comprises a number $N$ of interviewed students, and from
their questionnaires an average number $\langle k\rangle$ of acquaintances is extracted. With
solid lines we represent the fit obtained with a Brody distribution, Eq. (2), whose parameter
value is given in Fig. 4.

The distributions in Fig. 3 were obtained with values of $\beta$ slightly above zero, namely
between zero and one, as shown in Fig. 4. In this case one is able to obtain the non-trivial
positive slope which is typically observed for small $k$ values in the degree distribution of
such social networks. Interestingly, Fig. 4 also shows a linear trend between the average
degree $\langle k\rangle$ in the network and the corresponding value of $\beta$ which fits the
degree distribution. This guarantees that the distribution in Eq. (2) has indeed one single
parameter. How such a distribution can be obtained from an analytical approach to the model of
mobile agents is still an open question and will be discussed in detail elsewhere.
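
As a rough numerical illustration of the one-parameter claim, the sketch below evaluates a Brody density whose only input is the average degree. It assumes the standard unit-mean Brody form, in which the constant in the exponent is $[\Gamma((\beta+2)/(\beta+1))]^{\beta+1}$ and the prefactor is $(\beta+1)$ times that constant; this normalization is an assumption, since the text above does not spell the constants out. The linear relation between $\beta$ and the average degree uses the fit coefficients quoted for Fig. 4. The function names are hypothetical.

```python
import math


def brody_pdf(k_bar, beta):
    """Unit-mean Brody density in the rescaled variable k_bar = k / <k> (assumed form)."""
    eta = math.gamma((beta + 2.0) / (beta + 1.0)) ** (beta + 1.0)
    return (beta + 1.0) * eta * k_bar ** beta * math.exp(-eta * k_bar ** (beta + 1.0))


def beta_from_mean_degree(k_mean):
    """Linear fit between beta and <k> quoted for Fig. 4."""
    return 0.094 * k_mean + 0.078


def degree_density(k, k_mean):
    """Density in the unscaled degree k, fixed entirely by the average degree."""
    beta = beta_from_mean_degree(k_mean)
    return brody_pdf(k / k_mean, beta) / k_mean
```

For `beta = 0` the function `brody_pdf` reduces to a plain exponential, and for any positive `beta` it vanishes at zero and rises before decaying, which is the positive slope at small degree mentioned above.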







Figure 4. The linear dependence of the parameter $\beta$ of the Brody distribution in Eq. (2)
on the average number $\langle k\rangle$ of connections. Each bullet corresponds to one of the
schools whose degree distribution is plotted in Fig. 3. The solid line yields the fit
$\beta = 0.094\langle k\rangle + 0.078$.



3. Particular measures for social networks 

To measure "the cliquishness of a typical neighborhood" in a network, Watts and Strogatz [16]
introduced a simple coefficient, called the clustering coefficient, which counts the number
of pairs of neighbors of a certain node which are connected with each other, forming a
cycle of size $s = 3$. While such a tool enables one to assess the structure of complex
networks arising in many systems [3-7], helping to characterize small-world networks [16],
to understand synchronization in scale-free networks of oscillators [28] and to characterize
chemical reactions [29] and networks of social relationships [30] [31], there are other
situations where this measure is not suitable, namely when the network presents a multipartite
structure. For instance, when there are two different kinds of nodes and connections link only
nodes of different type, the network is bipartite [30] [31] [32] and the bipartite structure
does not allow the occurrence of cycles of odd size, in particular with $s = 3$.

Bipartite networks are quite common in social systems [32] [33], where the two different
kinds of nodes represent e.g. the two genders. While the standard clustering coefficient in such
networks is always zero, they have in general non-vanishing clustering properties [31], and
therefore more appropriate quantities to assess such networks have been proposed, namely
coefficients counting larger cycles. In this Section, we discuss how these different
clustering coefficients are related to each other and how one can use them to improve the
knowledge of the network structure.

The standard clustering coefficient $C_3$ is usually defined [16] as the ratio between the
number of cycles of size $s = 3$ (triangles) observed in the network and the total number of
possible triangles which may appear, namely

$$C_3(i) = \frac{2 t_i}{k_i(k_i - 1)}, \qquad (4)$$

where $t_i$ is the number of existing triangles containing node $i$ and $k_i$ is the number of
neighbors of node $i$, yielding a maximal number $k_i(k_i - 1)/2$ of triangles.
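
A minimal sketch of this definition, assuming the network is stored as a dictionary that maps each node to the set of its neighbors (the function name `clustering_c3` and this data layout are choices made here for illustration):

```python
from itertools import combinations


def clustering_c3(adj, i):
    """Standard clustering coefficient of node i: 2 t_i / (k_i (k_i - 1))."""
    k = len(adj[i])
    if k < 2:
        return 0.0  # no pair of neighbors, hence no possible triangle
    # t_i: number of pairs of neighbors of i that are themselves connected
    t = sum(1 for m, n in combinations(adj[i], 2) if n in adj[m])
    return 2.0 * t / (k * (k - 1))
```

On a triangle every node has coefficient one, while on a square (a cycle of four nodes) every node has coefficient zero, which is the bipartite situation discussed in this Section.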

To assess the cliquishness in bipartite networks, a clustering coefficient $C_4(i)$, sometimes
called the grid coefficient [34], has been proposed [21] [31] [32] [34]. It is defined as the
quotient between the number of cycles of size $s = 4$ (squares) and the total number of
possible squares. Explicitly, for a given node $i$ with two neighbors, say $m$ and $n$, this
coefficient yields [21]

$$C_{4,mn}(i) = \frac{q_{imn}}{(k_m - \eta_{imn})(k_n - \eta_{imn}) + q_{imn}}, \qquad (5)$$

where $q_{imn}$ is the number of common neighbors of $m$ and $n$ (not counting $i$) and
$\eta_{imn} = 1 + q_{imn} + \theta_{mn}$, with $\theta_{mn} = 1$ if neighbors $m$ and $n$ are
connected with each other and $\theta_{mn} = 0$ otherwise.

After averaging over the nodes, the coefficients $C_3$ and $C_4$ characterize the contribution
of the first and second neighbors, respectively, to the network cliquishness. In order to
be a suitable quantity to measure the cliquishness of bipartite networks compared to their
monopartite counterparts, $C_4$ must behave in the same way as $C_3$ when the network parameters
are changed, as is indeed the case for $\langle C_4\rangle$ computed from Eq. (5). See Ref. [21]
for details.
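
The square-counting coefficient for one pair of neighbors can be sketched as follows, again for a network stored as a dictionary of neighbor sets. The pair-level function follows the quotient described above. The node-level aggregation in `c4_node` (summing numerators and denominators over all pairs of neighbors before dividing) is an assumption made here for illustration, since the text only states the expression for a single pair.

```python
from itertools import combinations


def _c4_terms(adj, i, m, n):
    """Numerator q and denominator (k_m - eta)(k_n - eta) + q for neighbors m, n of i."""
    q = len((adj[m] & adj[n]) - {i})   # common neighbors of m and n, not counting i
    theta = 1 if n in adj[m] else 0    # 1 if m and n are directly connected
    eta = 1 + q + theta
    return q, (len(adj[m]) - eta) * (len(adj[n]) - eta) + q


def c4_pair(adj, i, m, n):
    """Fraction of possible squares through i, m and n that actually exist."""
    q, denom = _c4_terms(adj, i, m, n)
    return q / denom if denom > 0 else 0.0


def c4_node(adj, i):
    """Node-level coefficient aggregated over all neighbor pairs (assumed aggregation)."""
    num = den = 0
    for m, n in combinations(adj[i], 2):
        q, d = _c4_terms(adj, i, m, n)
        num += q
        den += d
    return num / den if den > 0 else 0.0
```

On a square every node has coefficient one, whereas on a triangle or a star the coefficient is zero, because no pair of neighbors shares a further common neighbor.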

One should notice that in most $m$-partite networks it is possible to have cycles
of size $s = 4$, indicating that $C_4$ is in some sense a more general clustering measure than
$C_3$. However, it could be the case that for a larger number of partitions forming the network,
the contribution of larger cycles increases. This is the case, for instance, for trophic
relations in an ecological network of individuals from different species, where large cycles
tend to be abundant, namely the ones ranging from the top predators to the plants at the lowest
trophic level. In such cases, a general clustering coefficient counting the fraction of possible
cycles of arbitrary size $n$ may be needed. The generalization is straightforward, yielding a
clustering coefficient $C_n = E_n/L_n$, where $E_n$ is the number of existing cycles of size $n$,
$L_n$ is the maximal number of such cycles that can be attained, and $n = 3, \dots, N$ for a
network of $N$ nodes.

Having $C_n$ for the required values of $n$, one is able to introduce a general clustering
measure of the network, given by the sum of all these contributions, namely

$$C = \sum_{n=3}^{N} \alpha_n C_n = \sum_{n=3}^{N} \alpha_n \frac{E_n}{L_n}, \qquad (6)$$

where $\alpha_n$ is a coefficient that weights the contribution of each clustering order $n$
and obeys the normalization condition $\sum_{n=3}^{N} \alpha_n = 1$. In general one can write
$E_n$ and $L_n$ in Eq. (6) as

$$E_n = \sum \binom{N}{n} P(k_1)\,q(k_1,k_2)\,N P(k_2)\,q(k_2,k_3) \cdots N P(k_{n-1})\,q(k_{n-1},k_n), \qquad (7)$$

where $\binom{N}{n}$ is the total number of combinations of $n$ elements out of $N$, $P(k)$ is
the fraction of nodes with $k$ neighbors and $q(k_1,k_2)$ is the degree correlation distribution,
i.e. the fraction of connections linking a node with $k_1$ neighbors to a node with $k_2$
neighbors. The companion expression for $L_n$, Eq. (8), is missing from the source text.

From Eq. (7) one can assume approximately that $E_n \sim (\langle P\rangle\langle q\rangle N)^n$,
with $\langle P\rangle$ and $\langle q\rangle$ the average fractions of $P(k)$ and $q(k_1,k_2)$
respectively. Since $L_n$ also increases as $N^n$, a possible suitable choice for $\alpha$ would
be a constant, namely $\alpha = 1/(N-2)$, obeying the normalization condition above. Having
presented this general scenario, we now concentrate on the first two clustering coefficients,
$C_3$ and $C_4$, to address the cycle size distribution.

Figure 5. Illustrative examples of cycles (size $s = 6$) where the most connected node (o) is
connected to (a) all the other nodes composing the cycle, forming four adjacent triangles. In
(b) the most connected node is connected to all other nodes except one, forming two triangles
and one sub-cycle of size $s = 4$, while in (c) the same cycle of size $s = 6$ encloses two
sub-cycles of size $s = 4$ and no triangles (see text).
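
The constant weight quoted for the general clustering measure can be checked directly. Since the cycle sizes run over $n = 3, \dots, N$, there are $N - 2$ terms in the sum, so a constant weight must equal the inverse of that number; the general measure then reduces to the plain average of the coefficients $C_n$:

```latex
\sum_{n=3}^{N} \alpha = (N-2)\,\alpha = 1
\quad\Longrightarrow\quad
\alpha = \frac{1}{N-2},
\qquad
C = \frac{1}{N-2}\sum_{n=3}^{N} C_n .
```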

We first show an estimate introduced in Ref. [35], which considers only the degree
distribution $P(k)$ and the distribution of the standard clustering coefficient $C_3(k)$. One
starts by considering the set of cycles with a central node, i.e. cycles with one node connected
to all other nodes composing the cycle, as illustrated in Fig. 5a. The central node composes
one triangle with each pair of connected neighbors. Due to this fact, the number of cycles
of size $s$ can be easily estimated: the number of different possible cycles is
$n_0(s,k) = \binom{k}{s-1}\frac{(s-1)!}{2}$ for a central node with $k$ neighbors, and the
corresponding fraction of these cycles which is expected to occur is
$p_0(s,k) = C_3(k)^{s-2}$, yielding a total number of $s$-cycles given by

$$N_s = N g_s \sum_{k=s-1}^{k_{\max}} P(k)\, n_0(s,k)\, p_0(s,k), \qquad (9)$$

where $g_s$ is a factor which takes into account the number of cycles counted more than once.

The estimate in Eq. (9) is a lower bound for the total number of cycles, since it considers
only cycles with a central node. Further, this estimate only accounts for cycles up to size
$s \le k_{\max} + 1$, with $k_{\max}$ the maximal degree, and is not suited for bipartite
networks, where $C_3(k) = 0$ for all $k$. Bipartite networks are typically composed of sets of
nodes such as those illustrated in Fig. 5c, where no central node exists.

By additionally using the coefficient $C_4(k)$ in a similar estimate, one is able to take
into account several cycles without central nodes. One first considers the set of cycles of size
$s$ with one node connected to all the others except one, as illustrated in Fig. 5b. Assuming
that this node has $k$ neighbors, $s - 2$ of them belonging to the cycle one is counting, one
has $n_1(s,k) = \binom{k}{s-2}(s-2)!/2$ different possible cycles of size $s$. The corresponding
fraction of such cycles which is expected to occur is given by
$p_1(s,k) = C_3(k)^{s-4}\,C_4(k)\,(1 - C_3(k))$. Writing an equation similar to Eq. (9), where
instead of $n_0(s,k)$ and $p_0(s,k)$ one has $n_1(s,k)$ and $p_1(s,k)$ respectively and the sum
starts at $k = s - 2$ instead of $k = s - 1$, one obtains an additional number $N'_s$ of
estimated cycles which is not considered in the estimate (9).
To improve the estimate further one repeats the same approach, each time taking out
one connection to the initial central node and thereby increasing by one the number of
elementary cycles of size $s = 4$. Figure 5c illustrates a cycle of size $s = 6$ composed of
two elementary cycles of size 4. In general, for cycles composed of $q$ sub-cycles of size 4
one finds $n_q(s,k) = \binom{k}{s-q-1}\frac{(s-q-1)!}{2}$ possible cycles of size $s$ looking
from a node with $k$ neighbors, and a fraction
$p_q(s,k) = C_3(k)^{s-2q-2}\,C_4(k)^{q}\,(1 - C_3(k))^{q}$ of them which are expected to be
observed.

Summing up over $k$ and $q$ yields our final expression

$$N_s = N g_s \sum_{q=0}^{[s/2]-1}\ \sum_{k=s-q-1}^{k_{\max}} P(k)\, n_q(s,k)\, p_q(s,k), \qquad (10)$$

where $[x]$ denotes the integer part of $x$. In particular, the first term ($q = 0$) is the sum
in Eq. (9), and the upper limit $[s/2] - 1$ of the first sum is obtained by imposing that the
exponent of $C_3(k)$ in $p_q(s,k)$ be non-negative.
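
The double sum can be evaluated numerically once the degree distribution and the two clustering coefficients are available as functions of the degree. The sketch below returns the normalized quantity, i.e. the number of cycles divided by the number of nodes and by the geometrical factor, since that factor is not specified. The function names and the convention of passing the distributions as callables are assumptions made here for illustration.

```python
from math import comb, factorial


def n_q(s, k, q):
    """Number of possible cycles of size s with q squares, seen from a node of degree k."""
    m = s - q - 1
    if m < 0 or m > k:
        return 0.0
    return comb(k, m) * factorial(m) / 2.0


def p_q(s, k, q, c3, c4):
    """Expected fraction of those cycles: C3^(s-2q-2) * C4^q * (1 - C3)^q."""
    return float(c3(k)) ** (s - 2 * q - 2) * float(c4(k)) ** q * (1.0 - c3(k)) ** q


def normalized_cycle_estimate(s, P, c3, c4, k_max):
    """Double sum over q and k of P(k) n_q(s,k) p_q(s,k), i.e. N_s / (N g_s)."""
    total = 0.0
    for q in range(0, s // 2):                      # q = 0, ..., [s/2] - 1
        for k in range(max(s - q - 1, 1), k_max + 1):
            total += P(k) * n_q(s, k, q) * p_q(s, k, q, c3, c4)
    return total
```

With `c3` identically zero, as in a bipartite network, only the terms with a vanishing exponent of the triangle coefficient survive, so the estimate is zero for every odd cycle size. With `c4` identically zero only the `q = 0` term remains, which is the estimate based on central nodes alone. The sketch is meant for moderate values of `k_max`, because the factorials grow quickly.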

The estimate in Eq. (10) not only improves the estimated number computed from Eq. (9),
but also enables the estimate of cycles up to a larger maximal size [21], namely up to
$s = 2k_{\max}$, where $k_{\max}$ is the maximal number of neighbors in the network.

The estimate in Eq. (10) also has the advantage of being able to estimate cycles in
bipartite networks. Since for bipartite networks $C_3(k) = 0$, all terms in Eq. (10) vanish
except those for which the exponent of $C_3(k)$ is zero, i.e. for $s = 2(q + 1)$ with $q$ an
integer, which naturally shows the absence of cycles of odd size in such networks.

For highly connected networks, both estimates should nevertheless yield similar results,
since in that case there is a very large number of both triangles and squares. For instance,
the so-called pseudo-fractal network [36] is a deterministic scale-free network, constructed
from three initial nodes connected with each other (generation $m = 0$) by iteratively adding
new generations of nodes, such that in generation $m + 1$ one new node is added to each
edge of generation $m$ and is connected to the two nodes joined by that edge. For these
networks, the exact number of cycles of size $s$ can be written iteratively [37] and can be
directly compared to the one obtained with the two estimates above. Figure 6a shows the
two estimates, while in Fig. 6b the exact number is plotted. We notice that the real
number $N_s$ of cycles and the normalized value $N_s/(N g_s)$, though different, have the same
shape. Thus, although the estimates above are not able to make the geometrical factor $g_s$
explicit, the corresponding normalized distributions agree very well with the real one. However,
while in this simple situation both estimates are similar, in general they can deviate
significantly, as illustrated in Fig. 6c. In such cases, the estimate (10) is closer to the real
distribution of cycle sizes [21].

Figure 6. (a) The fraction $N_s/(N g_s)$ of the number of cycles estimated from Eqs. (9), dashed
lines, and (10), solid lines, compared with (b) the exact number of cycles as a function of
the size for the pseudo-fractal network [36]. From small to large curves one has
pseudo-fractal networks with $m = 2, 3, 4, 5$ generations (see text). In (c) one sees the
comparison between both estimates in a scale-free network with degree distribution
$P(k) = P_0 k^{-\gamma}$ with $P_0 = 0.737$ and $\gamma = 2.5$, and coefficient distributions
$C_{3,4}(k) = C_{3,4}^{(0)} k^{-\alpha}$ with $C_3^{(0)} = 2$, $C_4^{(0)} = 0.33$ and
$\alpha = 0.9$.

4. Spreading phenomena in social networks 

In the previous Section we showed how the study of network structure can be addressed by
using tools such as the clustering coefficient and the first and second degree distributions.
However, although the ability to communicate within a network of contacts is favored by the
network topology [38], other measures are necessary to study dynamical phenomena occurring on
the network. Here, we focus on novel properties that help to ascertain the breadth and speed
of propagation phenomena through the network. We will describe two helpful quantities to
study propagation in a network. As we will see, these tools are particularly suited for a simple
model of gossip propagation that yields a striking result: in real social systems it is possible
to minimize the risk of being gossiped about, simply by choosing an optimal number of
friendship acquaintances.

We start by introducing the additional quantities in the context of gossip propagation. As
opposed to a rumor, a gossip always targets the details of the behavior or private life of
a specific person. Some information of a specific gossip is created at time $t = 0$ about the
victim by one of its neighbors. Since the gossip typically tends to be of interest only to those
who know the victim personally, we consider first that at each time step it only spreads from
the vertices that know the gossip to all of their neighbors that are connected to the victim and
do not yet know the gossip. Our dynamics is therefore like a burning algorithm [39], starting at
the originator and limited to sites that are neighbors of the victim. The gossip will spread
until all reachable neighbors of the victim know it, yielding a spreading time $\tau$.

To measure how effectively the gossip, or more generally the amount of information,
reaches the neighbors of the starting node (victim), we define the spread factor $f$ given
by

$$f = n_f/k, \qquad (11)$$

where $n_f$ is the total number of the $k$ neighbors who eventually hear the gossip in a network
with $N$ vertices (individuals). Although similar in particular cases, the spread factor $f$ and
the clustering coefficient are, in general, different, because the latter only measures the
number of bonds between neighbors, giving no insight into how they are connected.

Figure 7. Semi-logarithmic plot of the spreading time $\tau$ as a function of the degree $k$ for
(a) the Apollonian network ($n = 9$ generations) and (b) the Barabási-Albert network with
$N = 10^4$ nodes for $m = 3$ (circles), 5 (squares) and 7 (triangles), where $m$ is the number
of edges of a new site, averaged over 100 realizations. In the inset of (a) we show a schematic
design of the Apollonian lattice for $n = 3$ generations. Fitting Eq. (12) to these data we have
$B = 1.1$ in (a) and $B = 5.6$ for large $k$ in (b).
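
The gossip dynamics and the two quantities it yields can be sketched as a breadth-first "burning" restricted to the neighbors of the victim. The network is assumed to be stored as a dictionary of neighbor sets. Counting the originator among those who know the gossip, and averaging over all possible originators in `spread_factor`, are assumptions made here for illustration; the function names are hypothetical.

```python
def spread_gossip(adj, victim, originator):
    """Return (f, tau) for a gossip about `victim` started by its neighbor `originator`."""
    friends = adj[victim]
    if originator not in friends:
        raise ValueError("the originator must be a neighbor of the victim")
    informed = {originator}   # the originator knows the gossip at t = 0
    frontier = {originator}
    tau = 0
    while frontier:
        # The gossip only passes to neighbors of the victim who do not know it yet.
        frontier = {n for u in frontier for n in adj[u]
                    if n in friends and n not in informed}
        if frontier:
            informed |= frontier
            tau += 1
    return len(informed) / len(friends), tau


def spread_factor(adj, victim):
    """Spread factor of one victim, averaged over all possible originators."""
    values = [spread_gossip(adj, victim, o)[0] for o in adj[victim]]
    return sum(values) / len(values)
```

For a victim at the center of a wheel, where the neighbors form a closed ring, every originator reaches all neighbors and the spread factor is one. If one neighbor of the victim has no links to the other neighbors, a gossip started elsewhere never reaches it and the spread factor drops below one.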

In Fig. 7 one sees how the spreading time $\tau$ depends on the degree $k$ of the starting
node. The case of the Apollonian network [40] is shown in Fig. 7a, while the case of
Barabási-Albert networks is given in Fig. 7b. In both cases $\tau$ clearly grows logarithmically,

$$\tau = A + B\log k, \qquad (12)$$

for large $k$. In the case of the Apollonian network, one can even derive this behavior
analytically as follows. In order to communicate between two vertices of the $n$-th generation,
one needs up to $n$ steps, which leads to $\tau \propto n$. Since for the Apollonian network one
has [40] $k = 3 \times 2^{n-1}$, one immediately obtains $\tau \propto \log k$.
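
Written out, the step from the degree of the Apollonian network to the logarithmic law is a one-line inversion of the relation between degree and generation:

```latex
k = 3 \times 2^{\,n-1}
\quad\Longrightarrow\quad
n = 1 + \log_2\frac{k}{3} = 1 - \frac{\log 3}{\log 2} + \frac{\log k}{\log 2},
\qquad\text{so}\qquad
\tau \propto n \ \text{ grows linearly in } \log k .
```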

For the Apollonian network all neighbors of a given victim are connected in a closed path
surrounding the victim, as can be seen from the inset of Fig. 7a, yielding $f = 1$. This stresses
the fact that the spread factor $f$ is rather different from the clustering coefficient, which in
this case is $C = 0.828$ [40].

Next, we will show that for these two features to appear one needs the existence of
degree correlations between connected nodes, as usually observed in real empirical networks.
In Fig. 8 we plot the results of gossip spreading on an empirical set of networks extracted
from survey data [41] in 84 U.S. schools. Here, the logarithmic growth of $\tau$ with $k$, shown in
Fig. 8a, follows the same dependence as the average degree $k_{nn}$ of the nearest neighbors [42],
as illustrated in the inset. As in the case of the BA networks, we also find for the schools a
characteristic degree $k$ for which $f$, and therefore the gossip spreading, is smallest. The inset
of Fig. 8b, however, gives clear evidence that the school networks are not scale-free. Since
the same optimal degree appears in Barabási-Albert networks, one argues that the existence
of this optimal number is not necessarily related to the degree distribution of the network, but
Figure 8. Gossip propagation on a real friendship network of American students [41] averaged
over 84 schools. In (a) we show the spreading time $\tau$ and, in the inset, the average degree of
neighbors of nodes with degree $k$. In (b) the spread factor $f$, both as a function of degree $k$. In
the inset of (b) we see the degree distribution $P(k)$.



rather to the degree correlations. 
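
Degree correlations of this kind are commonly quantified by the average degree of the nearest neighbors as a function of the degree. A minimal sketch of its computation for a graph stored as an adjacency dictionary is given below; the function name `knn_by_degree` and the toy graph are illustrative assumptions.

```python
from collections import defaultdict

def knn_by_degree(adj):
    """Average nearest-neighbor degree as a function of the degree k.

    For every node the mean degree of its neighbors is computed;
    these values are then averaged over all nodes with the same k.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, nb in adj.items():
        k = len(nb)
        if k == 0:
            continue                      # isolated nodes have no neighbors
        sums[k] += sum(len(adj[u]) for u in nb) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}

# Toy example: a star with one hub and four leaves
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(knn_by_degree(star))   # leaves (k = 1) see the hub, the hub (k = 4) sees leaves
```

A curve that increases with $k$ signals positive degree correlations, while a decreasing curve, as in the star above, signals negative ones.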

However, the relation between degree correlations, measured by $k_{nn}$, and the logarithmic
behavior of the spreading time is not straightforward. While in the empirical network we
found the same dependence on $k$ for both $k_{nn}$ and $\tau$, in BA and APL networks $k_{nn}$ follows a
power law in $k$ (not shown). As for the spread factor $f$, a mean-field approach can
be derived, yielding a rate equation for $f$ which depends in general on $P(k)$ and on two- and
three-point correlations of the degree. In the case of uncorrelated networks, two- and three-point
correlations reduce to simple expressions of the moments of the degree distribution.
Therefore, $f$ is independent of the degree, similarly to what is observed for the density of
particles as derived by Catanzaro et al. [43] in diffusion-annihilation processes on complex
networks. For correlated networks, such as the empirical network studied here, the analytical
approach is not straightforward and will be presented elsewhere.
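
For reference, the simplification mentioned for uncorrelated networks can be written out for the two-point case. This is the standard result for uncorrelated random networks, given here as an illustration and not as the rate equation of the mean-field approach itself. In an uncorrelated network the probability that a bond emanating from a node of degree $k$ leads to a node of degree $k'$ does not depend on $k$:

```latex
\begin{align}
P(k'|k) &= \frac{k' P(k')}{\langle k \rangle}, \\
k_{nn}(k) &= \sum_{k'} k' P(k'|k) = \frac{\langle k^2 \rangle}{\langle k \rangle}.
\end{align}
```

The two-point correlation thus enters only through the first two moments of $P(k)$, and the average degree of the nearest neighbors is the same for every $k$.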

Another quantity of interest is the distribution $P(\tau)$ of spreading times, which clearly
decays exponentially for the Apollonian network, as illustrated in Fig. 9a. This behavior
can also be obtained analytically by considering that $P(\tau)\,d\tau = P(k)\,dk$ and using Eq. (12)
together with the degree distribution, $P(k) \propto k^{-\gamma}$, to obtain

$$P(\tau) \propto e^{\tau(1-\gamma)/B}, \tag{13}$$

for large $k$. The slope in Fig. 9a is precisely $(1-\gamma)/B = -0.17$ using $B$ from Fig. 7a and
$\gamma = 2.58$ from Ref. [40]. For the school network, $P(\tau)$ also follows an exponential decay for
large $\tau$, but with a 3.5 times smaller characteristic decay time, and has a maximum for small
$\tau$, as seen in Fig. 9b (circles). Compared to the $P(\tau)$ of the Barabási-Albert network with
$m = 9$ (solid line), the shapes are similar but the Barabási-Albert case is slightly shifted to
the right, due to the larger minimal number of connections.
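
The step from Eq. (12) to Eq. (13) can be spelled out as follows. This is a sketch of the change of variables, with all constant prefactors absorbed into the proportionality sign and the logarithm in Eq. (12) assumed to be the natural one. Inverting Eq. (12) gives the degree as a function of the spreading time $\tau$, and the power-law degree distribution with exponent $\gamma$ is transformed accordingly:

```latex
\begin{align}
k &= e^{(\tau - A)/B}, \qquad \frac{dk}{d\tau} = \frac{k}{B}, \\
P(\tau) &= P(k)\,\frac{dk}{d\tau} \propto k^{-\gamma}\,\frac{k}{B}
         \propto k^{1-\gamma} \propto e^{\tau (1-\gamma)/B}.
\end{align}
```

On a semi-logarithmic plot of $P(\tau)$ versus $\tau$ this corresponds to a straight line with slope $(1-\gamma)/B$.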

Many other regimes of gossip and of propagation phenomena can also be addressed with
these two quantities. Namely, a more realistic scenario could be addressed by enabling each
Figure 9. Distribution $P(\tau)$ of spreading times $\tau$ for (a) the APL network of 8 generations,
and (b) the real school network (circles) and the BA network with $m = 9$ and $N = 1000$ (solid
line).



node to transfer information with a probability $0 < q < 1$. Further, assuming that a person
whom the gossip did not reach at the first attempt will never get it yields a
regime similar to percolation, conditional on the neighborhood of the victim. By contrast, if at
each time step the neighbors who already know the gossip repeatedly try to spread it to the
common friends, one observes the same value of $f$ measured for $q = 1$, and the spreading
time scales as $\tau' \sim \tau/q$, where $\tau$ is measured for $q = 1$. Finally, other possible regimes
include the situation where the gossip spreads over strangers, i.e. over nodes which are
not directly connected to the victim. Such cases are being studied in detail and results will be
presented elsewhere [44].
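
The regime with repeated attempts can be explored with a small Monte Carlo sketch. It assumes that at every time step each informed neighbor of the victim tries independently, with probability q, to pass the gossip to each of its not yet informed friends that are also neighbors of the victim. The function name `noisy_gossip`, the synchronous update and the ring-shaped test neighborhood are illustrative assumptions, not taken from the text.

```python
import random

def noisy_gossip(adj, victim, originator, q, rng):
    """Gossip about `victim` started at `originator`; every informed
    neighbor retries at each time step, succeeding with probability q.
    Returns (number of neighbors reached, number of time steps)."""
    friends = adj[victim]
    informed = {originator}
    steps = 0
    while True:
        # bonds along which the gossip can still advance
        frontier = [(u, w) for u in informed
                    for w in adj[u] if w in friends and w not in informed]
        if not frontier:
            return len(informed), steps
        steps += 1
        informed |= {w for _, w in frontier if rng.random() < q}

# Victim 0 whose k = 20 neighbors form a closed path
k = 20
ring = {0: set(range(1, k + 1))}
for i in range(1, k + 1):
    ring[i] = {0, i % k + 1, (i - 2) % k + 1}

rng = random.Random(1)
for q in (1.0, 0.5, 0.25):
    runs = [noisy_gossip(ring, 0, 1, q, rng) for _ in range(2000)]
    reached = {n for n, _ in runs}          # distinct final sizes observed
    mean_t = sum(t for _, t in runs) / len(runs)
    print(q, reached, mean_t)
```

In this sketch every run ends with all $k$ neighbors informed for any $q > 0$, so the number of neighbors reached equals its $q = 1$ value, while the mean number of time steps grows as $q$ decreases.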

5. Discussion and conclusions 

In this paper we presented and developed recent achievements in social network research, 
concerning the modeling of empirical networks, and specific mathematical tools to address 
their structure and dynamical processes on them. 

Concerning the modeling of empirical networks, we briefly described a recent approach
based on a system of mobile agents. Further developments were given, namely concerning
the analytical expression which fits the typical degree distributions observed in
empirical social networks. We gave evidence that such distributions follow a Brody
distribution which depends on a single parameter that scales with the average degree of the
network. A question which remains to be answered is how to derive such a distribution
from an analytical and meaningful approach.

Showing that the usual clustering coefficient is, in general, inappropriate when
addressing the clustering properties of social networks, we described a suitable measure to
assess these properties and presented its additional applications for estimating the distribution
of cycles of higher order. This additional clustering coefficient was also put in a general
framework with other higher-order coefficients that could be useful for particular
situations of multipartite networks. An expansion combining all possible coefficients was
also proposed, motivated by previous worksflU, which depends only on the degree distribution 
and degree-degree correlations. However, the computational effort required to compute such
coefficients increases exponentially with their order, and therefore it is not yet clear how useful
such an expansion may be.

Finally, to study dynamical processes in social networks, in particular the propagation of
information, two simple measures were introduced. Namely, a spread factor, which measures
the maximal relative size of the neighborhood reached when the information starts from a
local source (node), and a spreading time, which gives the number of steps sufficient to reach
such maximal size. These two measures motivated the introduction of a minimal model for gossip
propagation, which can be seen as a particular model of opinions. Within this specific model,
the spread factor was found to be minimized by a particular non-trivial degree of the source,
which is related to the degree-degree correlations arising in the network. Whether this possibility
of minimizing the danger of being gossiped about can be tested in a real situation, and which other
implications these findings have in other situations, e.g. in internet virus propagation, remain
open questions for forthcoming studies.